Phytophthora exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Inferring Variation in Copy Number Using High Throughput Sequencing Data in R.

Internal identifier: 000739 (Main/Exploration); previous: 000738; next: 000740

Authors: Brian J. Knaus [United States]; Niklaus J. Grünwald [United States]

Source: Frontiers in genetics [1664-8021]; 2018

RBID : pubmed:29706990

Abstract

Inference of copy number variation presents a technical challenge because variant callers typically require the copy number of a genome or genomic region to be known a priori. Here we present a method to infer copy number that uses variant call format (VCF) data as input and is implemented in the R package vcfR. This method is based on the relative frequency of each allele (in both genic and non-genic regions) sequenced at heterozygous positions throughout a genome. These heterozygous positions are summarized by using arbitrarily sized windows of heterozygous positions, binning the allele frequencies, and selecting the bin with the greatest abundance of positions. This provides a non-parametric summary of the frequency that alleles were sequenced at. The method is applicable to organisms that have reference genomes that consist of full chromosomes or sub-chromosomal contigs. In contrast to other software designed to detect copy number variation, our method does not rely on an assumption of base ploidy, but instead infers it. We validated these approaches with the model system of Saccharomyces cerevisiae and applied it to the oomycete Phytophthora infestans, both known to vary in copy number. This functionality has been incorporated into the current release of the R package vcfR to provide modular and flexible methods to investigate copy number variation in genomic projects.

DOI: 10.3389/fgene.2018.00123
PubMed: 29706990
PubMed Central: PMC5909048
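
The windowing and binning step described in the abstract can be illustrated with a minimal sketch. This is not the vcfR implementation or its API: the function names, the window definition (a fixed count of consecutive heterozygous positions), the default bin width, and the tie-breaking rule are all assumptions made for illustration. The input is assumed to be a list of allele frequencies (for example, reads supporting one allele divided by total reads) already restricted to heterozygous positions and ordered along a chromosome or contig.

```python
from collections import Counter


def peak_allele_frequency(freqs, bin_width=0.02):
    """Return the midpoint of the most populated frequency bin.

    Hypothetical helper: bins the allele frequencies of one window into
    equal-width bins on [0, 1] and picks the bin holding the most positions.
    Returns None for an empty window.
    """
    if not freqs:
        return None
    n_bins = round(1 / bin_width)
    # A frequency of exactly 1.0 is folded into the last bin.
    counts = Counter(min(int(f * n_bins), n_bins - 1) for f in freqs)
    # Assumption: ties go to the lowest bin so the result is deterministic.
    best_bin = min(counts, key=lambda b: (-counts[b], b))
    return (best_bin + 0.5) / n_bins


def window_peaks(freqs, window_size, bin_width=0.02):
    """Summarize consecutive windows of heterozygous positions.

    Each window holds up to `window_size` positions; the last window
    may be shorter. One peak frequency is returned per window.
    """
    return [
        peak_allele_frequency(freqs[i:i + window_size], bin_width)
        for i in range(0, len(freqs), window_size)
    ]


if __name__ == "__main__":
    # Made-up frequencies, for illustration only.
    example = [0.48, 0.52, 0.51, 0.55, 0.33, 0.31, 0.35, 0.36, 0.66, 0.68]
    print(window_peaks(example, window_size=5, bin_width=0.1))
```

Under this sketch, a window whose peak sits near 1/2 would be consistent with two copies, peaks near 1/3 or 2/3 with three copies, and peaks near 1/4 or 3/4 with four copies. Because the summary is simply the most populated bin, it makes no distributional assumption about the frequencies, which matches the non-parametric character the abstract describes.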


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Inferring Variation in Copy Number Using High Throughput Sequencing Data in R.</title>
<author>
<name sortKey="Knaus, Brian J" sort="Knaus, Brian J" uniqKey="Knaus B" first="Brian J" last="Knaus">Brian J. Knaus</name>
<affiliation wicri:level="2">
<nlm:affiliation>Horticultural Crops Research Unit, United States Department of Agriculture-Agricultural Research Service, Corvallis, OR, United States.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Horticultural Crops Research Unit, United States Department of Agriculture-Agricultural Research Service, Corvallis, OR</wicri:regionArea>
<placeName>
<region type="state">Oregon</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Grunwald, Niklaus J" sort="Grunwald, Niklaus J" uniqKey="Grunwald N" first="Niklaus J" last="Grünwald">Niklaus J. Grünwald</name>
<affiliation wicri:level="2">
<nlm:affiliation>Horticultural Crops Research Unit, United States Department of Agriculture-Agricultural Research Service, Corvallis, OR, United States.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Horticultural Crops Research Unit, United States Department of Agriculture-Agricultural Research Service, Corvallis, OR</wicri:regionArea>
<placeName>
<region type="state">Oregon</region>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="2018">2018</date>
<idno type="RBID">pubmed:29706990</idno>
<idno type="pmid">29706990</idno>
<idno type="doi">10.3389/fgene.2018.00123</idno>
<idno type="pmc">PMC5909048</idno>
<idno type="wicri:Area/Main/Corpus">000763</idno>
<idno type="wicri:explorRef" wicri:stream="Main" wicri:step="Corpus" wicri:corpus="PubMed">000763</idno>
<idno type="wicri:Area/Main/Curation">000763</idno>
<idno type="wicri:explorRef" wicri:stream="Main" wicri:step="Curation">000763</idno>
<idno type="wicri:Area/Main/Exploration">000763</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Inferring Variation in Copy Number Using High Throughput Sequencing Data in R.</title>
<author>
<name sortKey="Knaus, Brian J" sort="Knaus, Brian J" uniqKey="Knaus B" first="Brian J" last="Knaus">Brian J. Knaus</name>
<affiliation wicri:level="2">
<nlm:affiliation>Horticultural Crops Research Unit, United States Department of Agriculture-Agricultural Research Service, Corvallis, OR, United States.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Horticultural Crops Research Unit, United States Department of Agriculture-Agricultural Research Service, Corvallis, OR</wicri:regionArea>
<placeName>
<region type="state">Oregon</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Grunwald, Niklaus J" sort="Grunwald, Niklaus J" uniqKey="Grunwald N" first="Niklaus J" last="Grünwald">Niklaus J. Grünwald</name>
<affiliation wicri:level="2">
<nlm:affiliation>Horticultural Crops Research Unit, United States Department of Agriculture-Agricultural Research Service, Corvallis, OR, United States.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Horticultural Crops Research Unit, United States Department of Agriculture-Agricultural Research Service, Corvallis, OR</wicri:regionArea>
<placeName>
<region type="state">Oregon</region>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Frontiers in genetics</title>
<idno type="ISSN">1664-8021</idno>
<imprint>
<date when="2018" type="published">2018</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Inference of copy number variation presents a technical challenge because variant callers typically require the copy number of a genome or genomic region to be known
<i>a priori</i>
. Here we present a method to infer copy number that uses variant call format (VCF) data as input and is implemented in the R package
<i>vcfR</i>
. This method is based on the relative frequency of each allele (in both genic and non-genic regions) sequenced at heterozygous positions throughout a genome. These heterozygous positions are summarized by using arbitrarily sized windows of heterozygous positions, binning the allele frequencies, and selecting the bin with the greatest abundance of positions. This provides a non-parametric summary of the frequency that alleles were sequenced at. The method is applicable to organisms that have reference genomes that consist of full chromosomes or sub-chromosomal contigs. In contrast to other software designed to detect copy number variation, our method does not rely on an assumption of base ploidy, but instead infers it. We validated these approaches with the model system of
<i>Saccharomyces cerevisiae</i>
and applied it to the oomycete
<i>Phytophthora infestans</i>
, both known to vary in copy number. This functionality has been incorporated into the current release of the R package
<i>vcfR</i>
to provide modular and flexible methods to investigate copy number variation in genomic projects.</div>
</front>
</TEI>
<pubmed>
<MedlineCitation Status="PubMed-not-MEDLINE" Owner="NLM">
<PMID Version="1">29706990</PMID>
<DateRevised>
<Year>2020</Year>
<Month>09</Month>
<Day>30</Day>
</DateRevised>
<Article PubModel="Electronic-eCollection">
<Journal>
<ISSN IssnType="Print">1664-8021</ISSN>
<JournalIssue CitedMedium="Print">
<Volume>9</Volume>
<PubDate>
<Year>2018</Year>
</PubDate>
</JournalIssue>
<Title>Frontiers in genetics</Title>
<ISOAbbreviation>Front Genet</ISOAbbreviation>
</Journal>
<ArticleTitle>Inferring Variation in Copy Number Using High Throughput Sequencing Data in R.</ArticleTitle>
<Pagination>
<MedlinePgn>123</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.3389/fgene.2018.00123</ELocationID>
<Abstract>
<AbstractText>Inference of copy number variation presents a technical challenge because variant callers typically require the copy number of a genome or genomic region to be known
<i>a priori</i>
. Here we present a method to infer copy number that uses variant call format (VCF) data as input and is implemented in the R package
<i>vcfR</i>
. This method is based on the relative frequency of each allele (in both genic and non-genic regions) sequenced at heterozygous positions throughout a genome. These heterozygous positions are summarized by using arbitrarily sized windows of heterozygous positions, binning the allele frequencies, and selecting the bin with the greatest abundance of positions. This provides a non-parametric summary of the frequency that alleles were sequenced at. The method is applicable to organisms that have reference genomes that consist of full chromosomes or sub-chromosomal contigs. In contrast to other software designed to detect copy number variation, our method does not rely on an assumption of base ploidy, but instead infers it. We validated these approaches with the model system of
<i>Saccharomyces cerevisiae</i>
and applied it to the oomycete
<i>Phytophthora infestans</i>
, both known to vary in copy number. This functionality has been incorporated into the current release of the R package
<i>vcfR</i>
to provide modular and flexible methods to investigate copy number variation in genomic projects.</AbstractText>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Knaus</LastName>
<ForeName>Brian J</ForeName>
<Initials>BJ</Initials>
<AffiliationInfo>
<Affiliation>Horticultural Crops Research Unit, United States Department of Agriculture-Agricultural Research Service, Corvallis, OR, United States.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Grünwald</LastName>
<ForeName>Niklaus J</ForeName>
<Initials>NJ</Initials>
<AffiliationInfo>
<Affiliation>Horticultural Crops Research Unit, United States Department of Agriculture-Agricultural Research Service, Corvallis, OR, United States.</Affiliation>
</AffiliationInfo>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList>
<PublicationType UI="D016428">Journal Article</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2018</Year>
<Month>04</Month>
<Day>13</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo>
<Country>Switzerland</Country>
<MedlineTA>Front Genet</MedlineTA>
<NlmUniqueID>101560621</NlmUniqueID>
<ISSNLinking>1664-8021</ISSNLinking>
</MedlineJournalInfo>
<KeywordList Owner="NOTNLM">
<Keyword MajorTopicYN="N">Phytophthora</Keyword>
<Keyword MajorTopicYN="N">R package</Keyword>
<Keyword MajorTopicYN="N">bioinformatics</Keyword>
<Keyword MajorTopicYN="N">computational biology</Keyword>
<Keyword MajorTopicYN="N">copy number variation (CNV)</Keyword>
<Keyword MajorTopicYN="N">high throughput sequencing (HTS)</Keyword>
<Keyword MajorTopicYN="N">ploidy</Keyword>
</KeywordList>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="received">
<Year>2018</Year>
<Month>01</Month>
<Day>31</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="accepted">
<Year>2018</Year>
<Month>03</Month>
<Day>26</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez">
<Year>2018</Year>
<Month>5</Month>
<Day>1</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed">
<Year>2018</Year>
<Month>5</Month>
<Day>1</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2018</Year>
<Month>5</Month>
<Day>1</Day>
<Hour>6</Hour>
<Minute>1</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>epublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pubmed">29706990</ArticleId>
<ArticleId IdType="doi">10.3389/fgene.2018.00123</ArticleId>
<ArticleId IdType="pmc">PMC5909048</ArticleId>
</ArticleIdList>
<ReferenceList>
<Reference>
<Citation>Bioinformatics. 2008 Jun 1;24(11):1403-5</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">18397895</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Mol Ecol. 2016 Jun;25(11):2413-26</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">27065091</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nature. 2009 Sep 17;461(7262):393-8</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19741609</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Mol Ecol Resour. 2017 Nov;17 (6):1156-1167</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">28150424</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Mol Ecol Resour. 2017 Jan;17 (1):54-66</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">27461508</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Microbiol Spectr. 2017 Jul;5(4):null</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">28752816</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genet Epidemiol. 2010 Sep;34(6):591-602</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">20718045</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2007 Oct 1;23(19):2633-5</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">17586829</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>PeerJ. 2017 Oct 4;5:e3889</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">29018622</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>BMC Bioinformatics. 2018 Apr 4;19(1):122</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">29618319</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Res. 2011 Dec;21(12):2224-41</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21926179</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Res. 2010 Sep;20(9):1297-303</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">20644199</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2012 May 15;28(10):1307-13</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">22474122</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2009 Aug 15;25(16):2078-9</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19505943</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Mol Ecol Resour. 2017 Jul;17 (4):656-669</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">27762098</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Res. 2009 Sep;19(9):1586-92</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19657104</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Mol Ecol Resour. 2017 Jan;17 (1):44-53</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">27401132</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nature. 2016 Feb 11;530(7589):177-83</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">26814963</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>PLoS One. 2013;8(3):e59128</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23527109</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2011 Aug 1;27(15):2156-8</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21653522</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Proc Natl Acad Sci U S A. 1973 Dec;70(12):3321-3</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">4519626</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Genet. 2011 May;43(5):491-8</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21478889</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>PLoS One. 2011 May 04;6(5):e19379</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21573248</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Elife. 2013 May 28;2:e00731</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23741619</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Gigascience. 2013 Jul 22;2(1):10</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23870653</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Trends Biotechnol. 2000 Jun;18(6):233-42</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">10802558</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Evolution. 2005 Aug;59(8):1633-8</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">16329237</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Mol Ecol Resour. 2017 Jan;17 (1):1-4</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">27860406</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nucleic Acids Res. 2012 May;40(9):e69</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">22302147</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Brief Bioinform. 2014 Mar;15(2):256-78</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23341494</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Rev Genet. 2017 Jul;18(7):411-424</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">28502977</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Rev Genet. 2001 Apr;2(4):280-91</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">11283700</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>G3 (Bethesda). 2016 Aug 09;6(8):2421-34</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">27317778</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Res. 2011 Jun;21(6):974-84</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21324876</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Front Genet. 2013 Dec 10;4:273</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">24368910</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>G3 (Bethesda). 2014 Mar 20;4(3):389-98</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">24374639</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
</PubmedData>
</pubmed>
<affiliations>
<list>
<country>
<li>États-Unis</li>
</country>
<region>
<li>Oregon</li>
</region>
</list>
<tree>
<country name="États-Unis">
<region name="Oregon">
<name sortKey="Knaus, Brian J" sort="Knaus, Brian J" uniqKey="Knaus B" first="Brian J" last="Knaus">Brian J. Knaus</name>
</region>
<name sortKey="Grunwald, Niklaus J" sort="Grunwald, Niklaus J" uniqKey="Grunwald N" first="Niklaus J" last="Grünwald">Niklaus J. Grünwald</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Bois/explor/PhytophthoraV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000739 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000739 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Bois
   |area=    PhytophthoraV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     pubmed:29706990
   |texte=   Inferring Variation in Copy Number Using High Throughput Sequencing Data in R.
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i   -Sk "pubmed:29706990" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd   \
       | NlmPubMed2Wicri -a PhytophthoraV1 

Wicri

This area was generated with Dilib version V0.6.38.
Data generation: Fri Nov 20 11:20:57 2020. Site generation: Wed Mar 6 16:48:20 2024